Smart Paper Notes

An Efficiency Study for SPLADE Models

Carlos Lassance , Stéphane Clinchant

Categories: Natural Language Processing

2022-07-08

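Latency and efficiency issues are often overlooked when evaluating IR models, because of the multiple hardware and software testing scenarios involved. Nevertheless, efficiency is an important part of such systems and should not be overlooked. In this paper, we focus on improving the efficiency of the SPLADE model, since it has achieved state-of-the-art zero-shot performance and competitive results on TREC collections. SPLADE efficiency can be controlled via a regularization factor, but controlling this regularization alone is not efficient enough. To reduce the latency gap between SPLADE and traditional retrieval systems, we propose several techniques, including L1 regularization for queries, a separation of document/query encoders, a FLOPS-regularized middle-training, and the use of faster query encoders. Our benchmark demonstrates that we can drastically improve the efficiency of these models while increasing the performance metrics on in-domain data. To our knowledge, we propose the first neural models that, under the same computing constraints, achieve similar latency (less than 4ms difference) to traditional BM25, while having similar performance (less than 10% MRR@10 reduction) to the state-of-the-art single-stage neural rankers on in-domain data.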
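The abstract names L1 regularization for queries and FLOPS regularization but gives no formulas. The sketch below uses the definitions commonly given for these two sparsity penalties: L1 as the batch mean of per-vector L1 norms, and FLOPS as the sum over vocabulary terms of the squared mean absolute activation. The function names and toy weights are hypothetical, and the code is an illustration under those assumptions rather than the authors' implementation.

```python
from typing import List


def l1_regularizer(batch: List[List[float]]) -> float:
    """Mean over the batch of the L1 norm of each sparse vector."""
    return sum(sum(abs(w) for w in vec) for vec in batch) / len(batch)


def flops_regularizer(batch: List[List[float]]) -> float:
    """Sum over vocabulary terms of the squared mean absolute activation."""
    n = len(batch)
    vocab_size = len(batch[0])
    total = 0.0
    for j in range(vocab_size):
        mean_activation = sum(abs(vec[j]) for vec in batch) / n
        total += mean_activation ** 2
    return total


# Toy batch: two query vectors over a four-term vocabulary (made-up weights).
queries = [
    [0.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 3.0],
]
l1_penalty = l1_regularizer(queries)
flops_penalty = flops_regularizer(queries)
```

The FLOPS penalty grows fastest for terms that are active across many vectors in the batch, while L1 penalizes every non-zero weight equally. That makes L1 a natural fit for shortening individual queries.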

Translated by Google Translate

A Study on Token Pruning for ColBERT

Carlos Lassance , Maroua Maachou , Joohee Park , Stéphane Clinchant

Categories: Natural Language Processing

2021-12-13

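The ColBERT model has recently been proposed as an effective BERT-based ranker. By adopting a late interaction mechanism, a major advantage of ColBERT is that document representations can be precomputed in advance. However, the big downside of the model is the index size, which scales linearly with the number of tokens in the collection. In this paper, we study various designs for ColBERT models in order to attack this problem. While compression techniques have been explored to reduce the index size, in this paper we study token pruning techniques for ColBERT. We compare simple heuristics, as well as a single layer of attention mechanism, to select the tokens to keep at indexing time. Our experiments show that ColBERT indexes can be pruned by up to 30% on the MS MARCO passage collection without a significant drop in performance. Finally, we experiment on MS MARCO documents, which reveals several challenges for this mechanism.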
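The abstract does not say which simple heuristics are compared. As a rough illustration of pruning at indexing time, the sketch below assumes two plausible ones: keeping the first k tokens, and keeping the k tokens with the highest IDF. All function names, the IDF table, and the example document are hypothetical.

```python
import math
from typing import Dict, List


def token_budget(num_tokens: int, prune_ratio: float) -> int:
    """Number of tokens to keep after pruning the given fraction."""
    return math.ceil(num_tokens * (1.0 - prune_ratio))


def keep_first_k(tokens: List[str], k: int) -> List[str]:
    """Position heuristic: keep only the first k tokens of the document."""
    return tokens[:k]


def keep_top_idf(tokens: List[str], idf: Dict[str, float], k: int) -> List[str]:
    """IDF heuristic: keep the k rarest tokens, preserving document order."""
    ranked = sorted(range(len(tokens)), key=lambda i: -idf.get(tokens[i], 0.0))
    kept_positions = sorted(ranked[:k])
    return [tokens[i] for i in kept_positions]


# Made-up document and IDF values, for illustration only.
doc = ["the", "colbert", "index", "is", "large"]
idf = {"the": 0.1, "colbert": 5.0, "index": 3.0, "is": 0.2, "large": 2.5}

first_three = keep_first_k(doc, 3)
rarest_three = keep_top_idf(doc, idf, 3)
```

Only the embeddings of the kept tokens would be written to the index, so the index shrinks in proportion to the number of tokens dropped.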


TLDR: Twin Learning for Dimensionality Reduction

Yannis Kalantidis , Carlos Lassance , Jon Almazan , Diane Larlus

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2021-10-18

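Dimensionality reduction methods are unsupervised approaches that learn low-dimensional spaces in which some properties of the initial space, typically the notion of "neighborhood", are preserved. Such methods usually require propagation on large k-NN graphs or complicated optimization solvers. On the other hand, self-supervised learning approaches, typically used to learn representations from scratch, rely on simple and more scalable frameworks for learning. In this paper, we propose TLDR, a dimensionality reduction method for generic input spaces that ports the recent self-supervised learning framework of Zbontar et al. (2021) to the specific task of dimensionality reduction, over arbitrary representations. We propose to use nearest neighbors to build pairs from a training set, and a redundancy reduction loss to learn an encoder that produces representations invariant across such pairs. TLDR is a method that is simple, easy to train, and broadly applicable. It consists of an offline nearest neighbor computation step that can be highly approximated, and a straightforward learning process. Aiming for scalability, we focus on improving linear dimensionality reduction, and show consistent gains on image and document retrieval tasks, e.g. gaining +4% mAP over PCA on ROxford for GeM-AP, and improving the performance of DINO on ImageNet or retaining it with a 10x compression.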
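The redundancy reduction loss of Zbontar et al. (2021) pushes the cross-correlation matrix between two batches of embeddings toward the identity. The sketch below is a minimal pure-Python version under stated assumptions: `z_a` holds the encoded points, `z_b` holds the encodings of their nearest neighbors, every dimension has non-zero variance within the batch, and the off-diagonal weight `lam` is an arbitrary illustrative value. It is not the authors' code.

```python
import math
from typing import List

Matrix = List[List[float]]


def standardize(batch: Matrix) -> Matrix:
    """Zero-mean, unit-variance scaling of each dimension across the batch.

    Assumes every dimension has non-zero variance in the batch.
    """
    n = len(batch)
    columns = list(zip(*batch))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / n)
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in batch]


def redundancy_reduction_loss(z_a: Matrix, z_b: Matrix, lam: float = 5e-3) -> float:
    """Penalize deviation of the cross-correlation matrix from the identity."""
    n = len(z_a)
    a, b = standardize(z_a), standardize(z_b)
    dim = len(a[0])
    loss = 0.0
    for i in range(dim):
        for j in range(dim):
            c_ij = sum(a[k][i] * b[k][j] for k in range(n)) / n
            if i == j:
                loss += (1.0 - c_ij) ** 2  # invariance term
            else:
                loss += lam * c_ij ** 2    # redundancy term
    return loss


# Toy pairs: each point is paired with an identical "neighbor" embedding.
decorrelated = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
redundant = [[1.0, -1.0], [-1.0, 1.0]]

loss_decorrelated = redundancy_reduction_loss(decorrelated, decorrelated)
loss_redundant = redundancy_reduction_loss(redundant, redundant, lam=0.5)
```

In the first toy batch the two output dimensions are uncorrelated, so the loss is zero. In the second they carry the same information with opposite sign, so only the redundancy term contributes.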


An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation

Kevin Moran , Ali Yachnes , George Purnell , Junayed Mahmud , Michele Tufano , Carlos Bernal-Cárdenas , Denys Poshyvanyk , Zach H'Doubler

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2023-01-03

Existing automated techniques for software documentation typically attempt to reason between two main sources of information: code and natural language. However, this reasoning process is often complicated by the lexical gap between more abstract natural language and more structured programming languages. One potential bridge for this gap is the Graphical User Interface (GUI), as GUIs inherently encode salient information about underlying program functionality into rich, pixel-based data representations. This paper offers one of the first comprehensive empirical investigations into the connection between GUIs and functional, natural language descriptions of software. First, we collect, analyze, and open source a large dataset of functional GUI descriptions consisting of 45,998 descriptions for 10,204 screenshots from popular Android applications. The descriptions were obtained from human labelers and underwent several quality control mechanisms. To gain insight into the representational potential of GUIs, we investigate the ability of four Neural Image Captioning models to predict natural language descriptions of varying granularity when provided a screenshot as input. We evaluate these models quantitatively, using common machine translation metrics, and qualitatively through a large-scale user study. Finally, we offer learned lessons and a discussion of the potential shown by multimodal models to enhance future techniques for automated software documentation.


Measuring and Estimating Key Quality Indicators in Cloud Gaming services

Carlos Baena , O. S. Peñaherrera-Pulla , Raquel Barco , Sergio Fortes

Categories: Machine Learning

2022-12-28

User equipment is one of the main bottlenecks facing the gaming industry nowadays. The extremely realistic games currently available place high computational demands on user devices. As a consequence, the game industry has proposed the concept of Cloud Gaming, a paradigm that improves the gaming experience on devices with limited hardware. To this end, games are hosted on remote servers, relegating users' devices to the role of a peripheral for interacting with the game. However, this paradigm overloads the communication links connecting the users with the cloud, so service experience becomes highly dependent on network connectivity. To overcome this, Cloud Gaming will be boosted by the promised performance of 5G and future 6G networks, together with the flexibility provided by mobility in multi-RAT scenarios, such as WiFi. In this scope, the present work proposes a framework for measuring and estimating the main E2E metrics of the Cloud Gaming service, namely its key quality indicators (KQIs). In addition, different machine learning techniques are assessed for predicting KQIs related to the Cloud Gaming user's experience. To this end, the main KQIs of the service, such as input lag, freeze percent, or perceived video frame rate, are collected in a real environment. Based on these, results show that machine learning techniques provide a good estimation of these indicators solely from network-based metrics. This is considered a valuable asset to guide the delivery of Cloud Gaming services through cellular communications networks even without access to the user's device, as is expected to be the case for telecom operators.


On the Level Sets and Invariance of Neural Tuning Landscapes

Binxu Wang , Carlos R. Ponce

Categories: Artificial Intelligence | Computer Vision | Neural and Evolutionary Computing

2022-12-26

Visual representations can be defined as the activations of neuronal populations in response to images. The activation of a neuron as a function over all image space has been described as a "tuning landscape". As a function over a high-dimensional space, what is the structure of this landscape? In this study, we characterize tuning landscapes through the lens of level sets and Morse theory. A recent study measured the in vivo two-dimensional tuning maps of neurons in different brain regions. Here, we developed a statistically reliable signature for these maps based on the change of topology in level sets. We found this topological signature changed progressively throughout the cortical hierarchy, with similar trends found for units in convolutional neural networks (CNNs). Further, we analyzed the geometry of level sets on the tuning landscapes of CNN units. We advanced the hypothesis that higher-order units can be locally regarded as isotropic radial basis functions, but not globally. This shows the power of level sets as a conceptual tool to understand neuronal activations over image space.


Principled and Efficient Transfer Learning of Deep Models via Neural Collapse

Xiao Li , Sheng Liu , Jinxin Zhou , Xinyu Lu , Carlos Fernandez-Granda , Zhihui Zhu , Qing Qu

Categories: Machine Learning | Artificial Intelligence | Computer Vision | Machine Learning (Statistics)

2022-12-23

With the ever-growing model size and the limited availability of labeled training data, transfer learning has become an increasingly popular approach in many science and engineering domains. For classification problems, this work delves into the mystery of transfer learning through an intriguing phenomenon termed neural collapse (NC), where the last-layer features and classifiers of learned deep networks satisfy: (i) the within-class variability of the features collapses to zero, and (ii) the between-class feature means are maximally and equally separated. Through the lens of NC, our findings for transfer learning are the following: (i) when pre-training models, preventing intra-class variability collapse (to a certain extent) better preserves the intrinsic structures of the input data, which leads to better model transferability; (ii) when fine-tuning models on downstream tasks, obtaining features with more NC on downstream data results in better test accuracy on the given task. The above results not only demystify many widely used heuristics in model pre-training (e.g., data augmentation, projection head, self-supervised learning), but also lead to a more efficient and principled fine-tuning method on downstream tasks, which we demonstrate through extensive experimental results.
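One way to make "within-class variability collapses" concrete is to compare the scatter of features around their class means with the scatter of class means around the global mean. The sketch below computes that ratio as a simplified scalar proxy. The abstract does not define a metric, so the function names, the ratio itself, and the toy features are assumptions for illustration, not the paper's measure.

```python
from typing import Dict, List


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def squared_distance(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def variability_ratio(features: List[List[float]], labels: List[int]) -> float:
    """Within-class scatter divided by between-class scatter.

    Lower values mean features sit closer to their class means, i.e. more
    collapse. Assumes the class means do not all coincide.
    """
    by_class: Dict[int, List[List[float]]] = {}
    for feature, label in zip(features, labels):
        by_class.setdefault(label, []).append(feature)
    global_mean = mean_vector(features)
    class_means = {label: mean_vector(vs) for label, vs in by_class.items()}
    within = sum(
        squared_distance(f, class_means[y]) for f, y in zip(features, labels)
    ) / len(features)
    between = sum(
        squared_distance(m, global_mean) for m in class_means.values()
    ) / len(class_means)
    return within / between


# Toy last-layer features for two classes (made-up values).
labels = [0, 0, 1, 1]
collapsed = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [-1.0, 0.0]]
spread = [[2.0, 0.0], [0.0, 0.0], [-2.0, 0.0], [0.0, 0.0]]

ratio_collapsed = variability_ratio(collapsed, labels)
ratio_spread = variability_ratio(spread, labels)
```

Both toy sets share the same class means, so the ratio differs only because of how tightly the features cluster around them.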


Ensemble learning techniques for intrusion detection system in the context of cybersecurity

Andricson Abeline Moreira , Carlos A. C. Tojeiro , Carlos J. Reis , Gustavo Henrique Massaro , Igor Andrade Brito , Kelton A. P. da Costa

Categories: Machine Learning

2022-12-21

Recently, there has been interest in improving the resources available in Intrusion Detection System (IDS) techniques. Several cybersecurity studies show that intrusions into computing environments and information kidnapping are increasingly frequent and complex. The criticality of business operations that rely on computing resources leaves no room for vulnerable information. Cybersecurity has become an indispensable technology in corporations, and security teams deal daily with preventing the risk of intrusions into the environment. Thus, the main objective of the study was to investigate the Ensemble Learning technique using the Stacking method, supported by the Support Vector Machine (SVM) and k-Nearest Neighbour (kNN) algorithms, aiming to optimize the results for DDoS attack detection. For this, the Intrusion Detection System concept was used with the application of the Data Mining and Machine Learning tool Orange to obtain better results.


AI applications in forest monitoring need remote sensing benchmark datasets

Emily R. Lines , Matt Allen , Carlos Cabo , Kim Calders , Amandine Debus , Stuart W. D. Grieve , Milto Miltiadou , Adam Noach , Harry J. F. Owen , Stefano Puliti

Categories: Artificial Intelligence

2022-12-20

With the rise in high resolution remote sensing technologies there has been an explosion in the amount of data available for forest monitoring, and an accompanying growth in artificial intelligence applications to automatically derive forest properties of interest from these datasets. Many studies use their own data at small spatio-temporal scales, and demonstrate an application of an existing or adapted data science method for a particular task. This approach often involves intensive and time-consuming data collection and processing, but generates results restricted to specific ecosystems and sensor types. There is a lack of widespread acknowledgement of how the types and structures of data used affect the performance and accuracy of analysis algorithms. To accelerate progress in the field more efficiently, benchmarking datasets upon which methods can be tested and compared are sorely needed. Here, we discuss how lack of standardisation impacts confidence in estimation of key forest properties, and how considerations of data collection need to be accounted for in assessing method performance. We present pragmatic requirements and considerations for the creation of rigorous, useful benchmarking datasets for forest monitoring applications, and discuss how tools from modern data science can improve use of existing data. We list a set of example large-scale datasets that could contribute to benchmarking, and present a vision for how community-driven, representative benchmarking initiatives could benefit the field.


LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer

Ning Yu , Chia-Chih Chen , Zeyuan Chen , Rui Meng , Gang Wu , Paul Josel , Juan Carlos Niebles , Caiming Xiong , Ran Xu

Categories: Computer Vision

2022-12-19

Graphic layout designs play an essential role in visual communication. Yet handcrafting layout designs is skill-demanding, time-consuming, and non-scalable to batch production. Although generative models have emerged to make design automation no longer utopian, it remains non-trivial to customize designs that comply with designers' multimodal desires, i.e., constrained by background images and driven by foreground contents. In this study, we propose LayoutDETR, which inherits the high quality and realism of generative modeling while reformulating content-aware requirements as a detection problem: we learn to detect in a background image the reasonable locations, scales, and spatial relations for multimodal elements in a layout. Experiments validate that our solution yields new state-of-the-art performance for layout generation on public benchmarks and on our newly-curated ads banner dataset. For practical usage, we build our solution into a graphical system that facilitates user studies. We demonstrate that our designs attract more subjective preference than baselines by significant margins. Our code, models, dataset, graphical system, and demos are available at https://github.com/salesforce/LayoutDETR.
